parallelizing `wyoung` resampling command

Jack Reimer

Join Date: Sep 2018

Posts: 52
#1

31 Mar 2022, 14:28

I am attempting to calculate adjusted p-values using the resampling methodology of Westfall and Young (1993). Fortunately, there is a handy and robust package `wyoung` that can perform this: https://github.com/reifjulian/wyoung

Unfortunately, my data are relatively large and I am running fairly parsimonious regressions so it's taking a very long time. Example:

Code:

local yvars "outcome1 outcome2 outcome3 outcome4 outcome5 outcome6 outcome7 outcome8"
wyoung `yvars', ///
    cmd(reg OUTCOMEVAR explanatory_1 explanatory_2 explanatory_3 ///
        explanatory_4 explanatory_5 explanatory_6 ///
        explanatory_7 explanatory_8 explanatory_9 ///
        explanatory_10 explanatory_11, vce(clu hhd_index)) ///
    familyp(explanatory_1 explanatory_2 explanatory_3 ///
        explanatory_4 explanatory_5 explanatory_6 ///
        explanatory_7 explanatory_8 explanatory_9 ///
        explanatory_10 explanatory_11) ///
    seed(33) boot(10000) cluster(my_cluster) replace

The above command takes just over 4 days to run on a single node of a Slurm cluster. I want to parallelize this code to decrease runtime. I've looked into the `parallel` package (https://github.com/gvegayon/parallel), but I have not successfully adapted it to this `wyoung` process.

1. Is there a way to parallelize this code using `parallel`?
2. Is there another means by which I can decrease runtime?

Last edited by Jack Reimer; 31 Mar 2022, 14:33.
Tags: bootstrap, parallelize, resampling, runtime, wyoung
Julian Reif

Join Date: Dec 2018

Posts: 54
#2

14 Jun 2022, 05:29

I have not tried using `parallel`. However, note that the vast majority of the computation time is taken up by the `regress` command, which scales nearly perfectly with the number of processors (see Figure 4). Thus, unfortunately, I do not think that the `parallel` package will help unless your bottleneck is the number of cores in your Stata license. In that case, the easiest (but also most expensive) option is to buy a license with more cores.
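
One way to check whether the license is the bottleneck is to compare the cores Stata is using with the cores the license and the machine allow. The snippet below is a minimal sketch, not something tested against the setup in this thread. It assumes a Stata version that exposes these `c()` values, and `set processors` only has an effect in Stata/MP.

```stata
* Report whether this is Stata/MP (1 = yes, 0 = no)
display "Stata/MP:            " c(MP)

* Cores Stata is currently set to use
display "Processors in use:   " c(processors)

* Cores allowed by the license
display "Licensed processors: " c(processors_lic)

* Cores present on the machine
display "Machine processors:  " c(processors_mach)

* Most cores Stata could use given both the license and the machine
display "Max usable:          " c(processors_max)

* In Stata/MP, ask Stata to use every core it is allowed to use
if c(MP) == 1 {
    set processors `c(processors_max)'
}
```

If the licensed count is well below the machine count, the license is likely the limiting factor. On a Slurm node, the job presumably also needs to request at least that many CPUs for the extra cores to be usable.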

Associate Professor of Finance and Economics
University of Illinois
www.julianreif.com